Exploring Measures of "Readability" for Spoken Language: Analyzing linguistic features of subtitles to identify age-specific TV programs

Authors

  • Sowmya Vajjala
  • Walt Detmar Meurers
Abstract

We investigate whether measures of readability can be used to identify age-specific TV programs. Based on a corpus of BBC TV subtitles, we employ a range of linguistic readability features motivated by Second Language Acquisition and Psycholinguistics research. Our hypothesis that such readability features can successfully distinguish between spoken language targeting different age groups is fully confirmed. The classifiers we trained on the basis of these readability features achieve a classification accuracy of 95.9%. Investigating several feature subsets, we show that the authentic material targeting specific age groups exhibits a broad range of linguistic and psycholinguistic characteristics that are indicative of the complexity of the language used.
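To make the idea of a readability feature concrete, here is a minimal sketch that computes three generic surface measures from a piece of subtitle text. It is an illustration only: the function name `surface_features` and the three measures (average sentence length, average word length, type-token ratio) are assumptions chosen for simplicity, not the feature set used in the paper, and the sentence splitter is a crude heuristic.

```python
import re


def surface_features(text):
    """Compute a few illustrative surface features for one subtitle text.

    Hypothetical helper for illustration; not the paper's feature set.
    """
    # Crude sentence split: break on whitespace that follows ., ! or ?
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    # Lowercased word tokens: runs of letters and apostrophes
    words = re.findall(r"[a-z']+", text.lower())
    if not sentences or not words:
        return {
            "avg_sentence_length": 0.0,
            "avg_word_length": 0.0,
            "type_token_ratio": 0.0,
        }
    return {
        # Mean number of word tokens per sentence
        "avg_sentence_length": len(words) / len(sentences),
        # Mean number of characters per word token
        "avg_word_length": sum(len(w) for w in words) / len(words),
        # Distinct word forms divided by total word tokens
        "type_token_ratio": len(set(words)) / len(words),
    }


if __name__ == "__main__":
    print(surface_features("The cat sat. The cat ran!"))
```

Feature dictionaries like this, computed per program, could then be passed to any standard classifier; which classifier and which features work best for age-group prediction is the question the paper itself addresses.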

Similar resources

The Representation of Non-Linguistic Sounds in Persian and English Subtitles for the Deaf and Hard-of-Hearing: A Comparative Study

Subtitling for the deaf and hard-of-hearing (SDH) is an area which deserves special attention as it enables these people to access the part of the ‘world’ intended for hearing people, including the world of ‘motion pictures’, and particularly movie sounds. Compared to linguistic sounds, non-linguistic sounds have received little attention in the field of translation, although they are in...

THE EFFECT OF STANDARD AND REVERSED SUBTITLING VERSUS NO SUBTITLING MODE ON L2 VOCABULARY LEARNING

Audiovisual material accompanied by interlingual subtitles is a powerful pedagogical tool which can help improve the vocabulary learning of second-language learners. This study was intended to determine whether or not the mode (standard and reversed) of subtitling affects the incidental vocabulary acquisition of Iranian L2 learners while watching TV programs. Forty-five participants were random...

Exploring Aphasia in Kalhori

Objectives: Despite numerous studies conducted to explore the manifestations of aphasia in different languages of the world, language-specific patterns of aphasic patients in Kalhori, a southern dialect of Kurdish spoken in part of Kermanshah Province, Iran, remain largely unpacked. The present study aims at investigating language deficits of a forty-year-old Kurdish-Persian aphasic woman, h...

An Analysis of Audiovisual Subtitling Translation Focusing on Wordplays from English into Persian in the Friends TV Series

Translation of humor and transferring its effect is one of the most challenging tasks of a translator due to the cultural clashes between the source language (SL) and the target language (TL). Accordingly, the present study aimed to specify the most frequently applied strategies in terms of Delabastita’s wordplay model used in SL and their translation strategy by Persian translators acc...

Named Entities in Indexing: A Case Study of TV Subtitles and Metadata Records

This paper explores the possible role of named entities in an automatic indexing process, based on text in subtitles. This is done by analyzing entity types, name density and name frequencies in subtitles and metadata records from different TV programs. The name density in metadata records is much higher than the name density in subtitles, and named entities with high frequencies in the subtitl...

Publication date: 2014